15 research outputs found

    Klasyfikacja danych niekompletnych w oparciu o komitet klasyfikatorów (Classification of incomplete data based on a committee of classifiers)

    Thesis: It is possible to maintain the accuracy of classification on incomplete data by selecting a committee of classifiers based on pre-selected features. The purpose of the work was to develop a classification committee, designed to classify data in which there are features that do not have defined values. The classifier would be able to process incomplete feature vectors without the need to pre-fill them, and the classification would be based on pre-selected features. Partial objectives have been specified in the work: 1. Estimation of the impact of missing or removed features of the object on the quality of classification. 2. Developing the structure of the classification committee. 3. Selection of classifiers operating in the committee. 4. Developing a decision-making algorithm (fuser) for the classification committee. 5. Selection of distinctive features for individual object classes. 6. Testing the developed system on real data. 7. Verification of the usefulness of the developed classifier for the construction of the system for assessment of liver fibrosis in patients with hepatitis C based on the analysis of peripheral blood parameters. The dissertation investigated the influence of the presence of null values in the data on the formation of incomplete reference (training) vectors depending on the size of the subspace of features on which the component classifiers of the committee work. The impact of the missing values on the quality of the classification has also been confirmed experimentally. Based on the conclusions regarding the distribution of missing values of features among reference vectors, the structure of the classification committee was proposed, based on the division of feature space into one-element vectors. For the proposed structure of the classification committee, a number of conventional classifiers were tested as component classifiers of the committee. 
The fc-NN classifier was chosen as the component classifier of the committee because it was the only tested classifier that benefited from this committee structure. The committee decision is evaluated by Bayesian averaging supplemented with a weighting factor for individual classes of reference objects, which is intended to improve classification quality for classes that are sparsely represented in the reference set. A committee with such a structure performs initial, dynamic filtering of features based on the vector of classified data: features that have no defined value in this vector are ignored in the classification process. To further improve classification quality, a feature pre-selection method based on the component classifier of the proposed committee has been proposed. This method uses a ranking of distinctive features for individual classes from the reference set to indicate a suboptimal subset of features on which the classification is conducted. The proposed SFfc-NN/C committee classifier has been tested on a number of benchmark databases containing complete real data. To determine how missing values of random features, in both classified and reference data, affect classification quality, null values were artificially introduced into the data, replacing the existing values of randomly selected features. The tests were carried out both with and without the initial selection of features. Finally, the classifier was used to classify actual medical data: blood analyses of HCV-infected patients. The undefined values in this data set occurred naturally. The test results were consistent with the results obtained earlier on data from which some values had been artificially removed. 
Thus, the usefulness of the proposed SFfc-NN/C classifier for the construction of a liver fibrosis assessment system for patients with hepatitis C has been confirmed. Achieving the partial objectives made it possible to confirm the thesis of the work. This confirmation is experimental and supported by the results of statistical tests.
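
The committee idea described in this abstract (one component classifier per feature, undefined features skipped, class-weighted averaging of the component outputs) can be illustrated with a minimal Python sketch. It is not the SFfc-NN/C classifier itself: the component classifier here is a plain k-NN vote on a single feature, swapped in as an assumption because the abstract does not define fc-NN, and the names `feature_posterior`, `committee_predict` and `class_weights` are hypothetical.

```python
from collections import Counter


def feature_posterior(train, labels, feature, value, k=3):
    """Class frequencies among the k nearest training values of one feature.

    Training vectors whose value for this feature is None are skipped,
    so incomplete reference vectors do not need to be pre-filled.
    """
    known = [(abs(row[feature] - value), lab)
             for row, lab in zip(train, labels)
             if row[feature] is not None]
    if not known:
        return {}
    known.sort(key=lambda pair: pair[0])
    nearest = known[:k]
    counts = Counter(lab for _, lab in nearest)
    return {lab: n / len(nearest) for lab, n in counts.items()}


def committee_predict(train, labels, query, k=3, class_weights=None):
    """Sum class-weighted per-feature posteriors over the defined features."""
    classes = sorted(set(labels))
    weights = class_weights or {c: 1.0 for c in classes}
    totals = {c: 0.0 for c in classes}
    used = 0
    for feature, value in enumerate(query):
        if value is None:  # dynamic filtering: ignore undefined features
            continue
        post = feature_posterior(train, labels, feature, value, k)
        if not post:
            continue
        used += 1
        for c in classes:
            totals[c] += weights[c] * post.get(c, 0.0)
    if used == 0:
        raise ValueError("query has no usable features")
    return max(classes, key=lambda c: totals[c])


# Toy data (made up for illustration); None marks an undefined value.
train = [[1.0, 10.0], [1.2, None], [0.9, 11.0],
         [5.0, 50.0], [5.2, 49.0], [None, 51.0]]
labels = ["a", "a", "a", "b", "b", "b"]
print(committee_predict(train, labels, [1.1, None]))
print(committee_predict(train, labels, [None, 50.5]))
```

Passing a larger weight for a rare class in `class_weights` mimics the weighting factor mentioned in the abstract; the actual weighting scheme used in the dissertation is not given here.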

    The k-NN classifier and self-adaptive Hotelling data reduction technique in handwritten signatures recognition

    The paper proposes a novel signature verification concept. This new approach uses appropriate similarity coefficients to evaluate the associations between signature features. Such an association, called a new composed feature, enables the calculation of a new form of similarity between objects. The most important advantage of the proposed solution is the case-by-case matching of similarity coefficients to the signature features, which can be used to assess whether a given signature is genuine or forged. The procedure is repeated for each person represented in the signature database. In the verification stage, a two-class classifier distinguishes genuine from forged signatures. A broad range of classifiers is evaluated in the paper; all of them operate on features observed and computed during the data preparation stage. The set of composed signature features of a given person can be reduced, which decreases the verification error. This effect does not occur for the raw features. The proposed approach was tested in a practical environment, with handwritten signatures used as the objects to be compared. The high level of signature recognition obtained confirms that the proposed methodology is efficient and that it can be adapted to accommodate as yet unknown features. The proposed approach can be incorporated into biometric systems.

    Human Activity Detection Based on the iBeacon Technology

    The paper presents a new method of patient activity monitoring that uses modern ADL (Activities of Daily Living) techniques. The proposed method utilizes energy-efficient Bluetooth iBeacon BLE (Bluetooth Low Energy) modules, developed by Apple. The main advantage of this technology is the ability to detect neighboring devices that belong to the same device family. The proposed method is based on observing changes of the received signal strength indicator (RSSI) in the time domain. The RSSI analysis is performed in order to assess human activity. Such observation may be particularly useful for monitoring the consciousness of elderly people, where the reaction time of emergency rescuers and appropriate rescue operations may save lives.
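
As a rough illustration of observing RSSI changes in the time domain, the Python sketch below flags a window of samples as "active" when the RSSI fluctuates strongly. It is only a guess at what such an analysis could look like: the sliding-window standard deviation, the window length and the 3 dB threshold are all assumptions, not values or methods taken from the paper, and `activity_flags` is a hypothetical name.

```python
from statistics import pstdev


def activity_flags(rssi, window=5, threshold_db=3.0):
    """Return one boolean per full window of RSSI samples (in dBm).

    A window is flagged True when the population standard deviation of
    its samples exceeds threshold_db, i.e. when the signal strength
    changes a lot, which is taken here as a proxy for movement.
    """
    if window < 2:
        raise ValueError("window must contain at least two samples")
    flags = []
    for start in range(len(rssi) - window + 1):
        chunk = rssi[start:start + window]
        flags.append(pstdev(chunk) > threshold_db)
    return flags


# Made-up readings: a nearly constant signal and a strongly varying one.
still = [-60, -61, -60, -60, -61]
moving = [-60, -70, -55, -72, -58]
print(activity_flags(still), activity_flags(moving))
```

A prolonged run of unflagged windows could then be treated as a lack of activity worth checking, which is the monitoring scenario the abstract describes.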

    Justified granulation aided noninvasive liver fibrosis classification system

    According to the World Health Organization (WHO), 130-150 million people globally are chronically infected with hepatitis C virus. The virus is responsible for chronic hepatitis that may ultimately cause liver cirrhosis and death. The disease is progressive; however, antiviral treatment may slow down or stop its development. Therefore, it is important to estimate the severity of liver fibrosis for diagnostic, therapeutic and prognostic purposes. Liver biopsy provides a high-accuracy diagnosis, but it is a painful and invasive procedure. Recently, there has been a surge of non-invasive tests (biological and physical) aiming to define the severity of liver fibrosis, but the commonly used FibroTest®, according to independent research, may in some cases have an accuracy lower than 50 %. In this paper, a data mining and classification technique is proposed to determine the stage of liver fibrosis using easily accessible laboratory data. Methods: Research was carried out on archival records of routine laboratory blood tests (morphology, coagulation, biochemistry, protein electrophoresis), with histopathology records of liver biopsy as the reference value. As a result, a granular model was proposed that contains a series of intervals representing the influence of separate blood attributes on the liver fibrosis stage. The model determines the final diagnosis for a patient using an aggregation method and a voting procedure. The proposed solution is robust to missing or corrupted data. Results: The results were obtained on data from 290 patients with hepatitis C virus collected over 6 years. The model has been validated using training and test data. The overall accuracy of the solution is 67.9 %. The intermediate liver fibrosis stages are hard to distinguish, due to the effectiveness of biopsy itself. Additionally, the method was verified against a dataset obtained from 365 patients with liver disease of various etiologies. The model proved to be robust to new data. 
Notably, the error rate for misclassifying the first stage and the last stage is below 6.5 % for all analyzed datasets. Conclusions: The proposed system supports the physician and defines the stage of liver fibrosis in chronic hepatitis C. The biggest advantage of the solution is a human-centric approach using intervals, which can be verified by a specialist before the final decision is given. Moreover, it is robust to missing data. The system can be used as a powerful support tool for diagnosis in real treatment.
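
The interval-and-voting scheme described in this abstract can be pictured with a short Python sketch. It is a simplified illustration under stated assumptions, not the authors' model: the attribute names, the interval boundaries and the four stage labels in `INTERVALS` are placeholders with no clinical meaning, and a simple majority vote stands in for the aggregation method, which the abstract does not specify.

```python
from collections import Counter

# Hypothetical granules: attribute -> list of (low, high, stage).
# All numbers are placeholders, NOT clinical thresholds.
INTERVALS = {
    "attr_a": [(0.0, 10.0, 1), (10.0, 20.0, 2), (20.0, 100.0, 4)],
    "attr_b": [(0.0, 1.0, 1), (1.0, 2.0, 3), (2.0, 10.0, 4)],
    "attr_c": [(0.0, 50.0, 4), (50.0, 150.0, 2), (150.0, 500.0, 1)],
}


def vote_for(attribute, value):
    """Stage whose interval contains the value, or None if none matches."""
    for low, high, stage in INTERVALS.get(attribute, []):
        if low <= value < high:
            return stage
    return None


def classify(record):
    """Majority vote over attributes; missing (None) values cast no vote."""
    votes = Counter()
    for attribute, value in record.items():
        if value is None:
            continue
        stage = vote_for(attribute, value)
        if stage is not None:
            votes[stage] += 1
    if not votes:
        return None  # nothing usable in the record
    return votes.most_common(1)[0][0]


print(classify({"attr_a": 5.0, "attr_b": 0.5, "attr_c": None}))
print(classify({"attr_a": 25.0, "attr_b": None, "attr_c": 30.0}))
```

Because each attribute votes independently, a missing or out-of-range value simply removes one vote, which is one way to obtain the robustness to missing data that the abstract reports. The explicit intervals are also what a specialist could inspect before accepting the result.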

    A New Hand-Movement-Based Authentication Method Using Feature Importance Selection with the Hotelling's Statistic

    The growing amount of collected and processed data means that there is a need to control access to these resources. Very often, this type of control is carried out on the basis of biometric analysis. The article proposes a new user authentication method based on a spatial analysis of the movement of the finger's position. This movement creates a sequence of data that is registered by a motion recording device. The presented approach combines spatial analysis of the position of all fingers at a given time. The proposed method is able to use the specific, often different finger movements of each user. The experimental results confirm the effectiveness of the method in biometric applications. In this paper, we also introduce an effective method of feature selection based on the Hotelling T² statistic. This approach allows selecting the most distinctive features of each object from the set of all objects in the database. This is possible thanks to the appropriate preparation of the input data.
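
To give a feel for feature selection with the Hotelling statistic, the Python sketch below ranks features by the two-sample Hotelling T² computed one feature at a time, where it reduces to T² = (n1·n2/(n1+n2))·(mean1 − mean2)²/s², with s² the pooled variance. This univariate ranking is an assumption made for brevity; the paper's actual procedure and data preparation are not described in the abstract, and the names `hotelling_t2_1d` and `rank_features` are hypothetical.

```python
from statistics import mean, variance


def hotelling_t2_1d(x, y):
    """Two-sample Hotelling T^2 for a single feature (squared pooled t)."""
    n1, n2 = len(x), len(y)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    diff = mean(x) - mean(y)
    if pooled == 0:
        # Degenerate case: no spread at all in either sample.
        return float("inf") if diff != 0 else 0.0
    return (n1 * n2 / (n1 + n2)) * diff ** 2 / pooled


def rank_features(own, others):
    """Feature indices ordered from most to least distinctive.

    own:    feature vectors of one user
    others: feature vectors of the remaining users
    """
    scores = []
    for j in range(len(own[0])):
        t2 = hotelling_t2_1d([row[j] for row in own],
                             [row[j] for row in others])
        scores.append((t2, j))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [j for _, j in scores]


# Toy data: feature 0 separates the two groups, feature 1 does not.
own = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0]]
others = [[4.0, 10.0], [5.0, 11.0], [6.0, 12.0]]
print(rank_features(own, others))
```

Keeping only the top-ranked indices for each user gives a per-user feature subset, in the spirit of selecting the most distinctive features of each object.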

    Follicular adenomas exhibit a unique metabolic profile. ¹H NMR studies of thyroid lesions.

    Thyroid cancer is the most common endocrine malignancy. However, more than 90% of thyroid nodules are benign. It remains unclear whether thyroid carcinoma arises from preexisting benign nodules. Metabolomics can provide valuable and comprehensive information about low molecular weight compounds present in living systems and further our understanding of the biology regulating pathological processes. Herein, we applied ¹H NMR-based metabolic profiling to identify the metabolites present in aqueous tissue extracts of healthy thyroid tissue (H), non-neoplastic nodules (NN), follicular adenomas (FA) and malignant thyroid cancer (TC) as an alternative way of investigating cancer lesions. Multivariate statistical methods provided clear discrimination not only between healthy thyroid tissue and pathological thyroid tissue but also between different types of thyroid lesions. Potential biomarkers common to all thyroid lesions were identified, namely, alanine, methionine, acetone, glutamate, glycine, lactate, tyrosine, phenylalanine and hypoxanthine. Metabolic changes in thyroid cancer were mainly related to osmotic regulators (taurine and scyllo- and myo-inositol), citrate, and amino acids supplying the TCA cycle. Thyroid follicular adenomas were found to display metabolic features of benign non-neoplastic nodules and simultaneously displayed a partial metabolic profile associated with malignancy. This finding allows the discrimination of follicular adenomas from benign non-neoplastic nodules and thyroid cancer with similar accuracy. Moreover, the presented data indicate that follicular adenoma could be an individual stage of thyroid cancer development.

    OPLS-DA results and corresponding loadings of discrimination between different thyroid lesion groups: (a) TC vs. NN, (b) FA vs. NN, (c) TC vs. FA

    Non-neoplastic nodules - NN, follicular adenomas - FA, thyroid cancer - TC. The color bar corresponds to the absolute value of the correlation loading in the discrimination model.